Risk-aware linear bandits with convex loss
In decision-making problems such as the multi-armed bandit, an agent learns
sequentially by optimizing a certain feedback. While the mean reward criterion
has been extensively studied, other measures that reflect an aversion to
adverse outcomes, such as mean-variance or conditional value-at-risk (CVaR),
can be of interest for critical applications (healthcare, agriculture).
Algorithms have been proposed for such risk-aware measures under bandit
feedback without contextual information. In this work, we study contextual
bandits where such risk measures can be elicited as linear functions of the
contexts through the minimization of a convex loss. A typical example that fits
within this framework is the expectile measure, which is obtained as the
solution of an asymmetric least-squares problem. Using the method of mixtures
for supermartingales, we derive confidence sequences for the estimation of such
risk measures. We then propose an optimistic UCB algorithm to learn optimal
risk-aware actions, with regret guarantees similar to those of generalized
linear bandits. This approach requires solving a convex problem at each round
of the algorithm, which we can relax by allowing only approximate solutions
obtained by online gradient descent, at the cost of slightly higher regret. We
conclude by evaluating the resulting algorithms in numerical experiments.
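The expectile mentioned above is defined as the minimizer of an asymmetric least-squares loss. A minimal NumPy sketch of this idea, using a simple fixed-point iteration (the helper name `expectile` and the iteration scheme are illustrative assumptions, not the paper's code):

```python
import numpy as np

def expectile(samples, tau, n_iter=100):
    """Estimate the tau-expectile of a sample by minimizing the
    asymmetric least-squares loss
        L(e) = mean(|tau - 1{x < e}| * (x - e)^2)
    via the first-order condition e = weighted mean of the samples."""
    e = np.mean(samples)  # tau = 0.5 recovers the mean
    for _ in range(n_iter):
        # Observations above the current estimate get weight tau,
        # those below get weight 1 - tau.
        w = np.where(samples >= e, tau, 1.0 - tau)
        e = np.sum(w * samples) / np.sum(w)
    return e

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
print(expectile(x, 0.5))  # equals the sample mean
print(expectile(x, 0.9))  # above the mean: adverse outcomes weigh more
```

For tau = 0.5 the weights are uniform and the expectile is the mean; tau > 0.5 shifts the estimate upward, which is what makes the measure risk-sensitive.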
Regulation of ABCC6 trafficking and stability by a conserved C-terminal PDZ-like sequence
Mutations in the ABCC6 ABC-transporter are causative of pseudoxanthoma elasticum (PXE). The loss of functional ABCC6 protein in the basolateral membrane of the kidney and liver is putatively associated with altered secretion of a circulatory factor. As a result, systemic changes in elastic tissues are caused by progressive mineralization and degradation of elastic fibers. Premature arteriosclerosis, loss of skin and vascular tone, and a progressive loss of vision result from this ectopic mineralization. However, the identity of the circulatory factor and the specific role of ABCC6 in disease pathophysiology are not known. Though recessive loss-of-function alleles are associated with alterations in ABCC6 expression and function, the molecular pathologies associated with the majority of PXE-causing mutations are also not known. Sequence analysis of orthologous ABCC6 proteins indicates the C-terminal sequences are highly conserved and share high similarity to the PDZ sequences found in other ABCC subfamily members. Genetic testing of PXE patients suggests that at least one disease-causing mutation is located in a PDZ-like sequence at the extreme C-terminus of the ABCC6 protein. To evaluate the role of this C-terminal sequence in the biosynthesis and trafficking of ABCC6, a series of mutations were utilized to probe changes in ABCC6 biosynthesis, membrane stability and turnover. Removal of this PDZ-like sequence resulted in decreased steady-state ABCC6 levels, decreased cell surface expression and stability, and mislocalization of the ABCC6 protein in polarized cells. These data suggest that the conserved, PDZ-like sequence promotes the proper biosynthesis and trafficking of the ABCC6 protein. © 2014 Xue et al
Development and validation of an interpretable machine learning-based calculator for predicting 5-year weight trajectories after bariatric surgery: a multinational retrospective cohort SOPHIA study
Background Weight loss trajectories after bariatric surgery vary widely
between individuals, and predicting weight loss before the operation remains
challenging. We aimed to develop a model using machine learning to provide
individual preoperative prediction of 5-year weight loss trajectories after
surgery. Methods In this multinational retrospective observational study we
enrolled adult participants (aged ≥18 years) from ten prospective cohorts
(including ABOS [NCT01129297], BAREVAL [NCT02310178], the Swedish Obese
Subjects study, and a large cohort from the Dutch Obesity Clinic [Nederlandse
Obesitas Kliniek]) and two randomised trials (SleevePass [NCT00793143] and
SM-BOSS [NCT00356213]) in Europe, the Americas, and Asia, with a 5-year
follow-up after Roux-en-Y gastric bypass, sleeve gastrectomy, or gastric band.
Patients with a previous history of bariatric surgery or large delays between
scheduled and actual visits were excluded. The training cohort comprised
patients from two centres in France (ABOS and BAREVAL). The primary outcome was
BMI at 5 years. A model was developed using least absolute shrinkage and
selection operator to select variables and the classification and regression
trees algorithm to build interpretable regression trees. The performances of
the model were assessed through the median absolute deviation (MAD) and root
mean squared error (RMSE) of BMI. Findings: 10 231 patients from 12 centres in
ten countries were included in the analysis, corresponding to 30 602
patient-years. Among participants in all 12 cohorts, 7701 (75.3%) were
female and 2530 (24.7%) were male. Among 434 baseline attributes available
in the training cohort, seven variables were selected: height, weight,
intervention type, age, diabetes status, diabetes duration, and smoking status.
At 5 years, across external testing cohorts the overall mean MAD of BMI was
2.8 kg/m² (95% CI 2.6-3.0) and the mean RMSE of BMI was
4.7 kg/m² (4.4-5.0), and the mean difference
between predicted and observed BMI was -0.3 kg/m² (SD 4.7).
This model is incorporated into an easy-to-use and interpretable web-based
prediction tool to help inform clinical decisions before surgery.
Interpretation: We developed a machine learning-based model, which is
internationally validated, for predicting individual 5-year weight loss
trajectories after three common bariatric interventions. Comment: The Lancet Digital Health, 202
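The two-step pipeline described in the Methods (LASSO to screen variables, then CART to build interpretable regression trees) can be sketched on synthetic data. The column layout, coefficients, and scikit-learn calls below are illustrative assumptions, not the study's actual code or variables:

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 500
# Synthetic stand-in for preoperative attributes (hypothetical columns);
# the outcome plays the role of 5-year BMI.
X = rng.normal(size=(n, 20))
y = 30 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=1.0, size=n)

# Step 1: LASSO screens the attributes; variables with nonzero
# coefficients are kept.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
selected = np.flatnonzero(lasso.coef_)

# Step 2: a shallow CART tree on the selected variables gives an
# interpretable set of prediction rules (each leaf is a predicted outcome).
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X[:, selected], y)
pred = tree.predict(X[:, selected])

# Report the two metrics used above: median absolute deviation and RMSE.
mad = np.median(np.abs(pred - y))
rmse = np.sqrt(np.mean((pred - y) ** 2))
print(f"selected {selected.size} variables, MAD={mad:.2f}, RMSE={rmse:.2f}")
```

Keeping the tree shallow is what makes the final model readable as a small set of if/else rules, at the price of some accuracy relative to a black-box model.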
From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits
The stochastic multi-armed bandit problem has been extensively studied under standard assumptions on the arms' distributions (e.g. bounded with known support, exponential family, etc.). These assumptions are suitable for many real-world problems, but they sometimes require knowledge (on tails, for instance) that may not be precisely accessible to the practitioner, raising the question of the robustness of bandit algorithms to model misspecification. In this paper we study a generic Dirichlet Sampling (DS) algorithm, based on pairwise comparisons of empirical indices computed with re-sampling of the arms' observations and a data-dependent exploration bonus. We show that different variants of this strategy achieve provably optimal regret guarantees when the distributions are bounded, and logarithmic regret for semi-bounded distributions with a mild quantile condition. We also show that a simple tuning achieves robustness with respect to a large class of unbounded distributions, at the cost of slightly worse than logarithmic asymptotic regret. We finally provide numerical experiments showing the merits of DS in a decision-making problem on synthetic agriculture data.
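A minimal sketch of the pairwise-comparison idea behind DS, assuming bounded rewards and using a known upper bound as the optimistic exploration bonus. The function and tuning below are illustrative, not the paper's exact algorithm:

```python
import numpy as np

def dirichlet_index(observations, bonus, rng):
    """One Dirichlet-resampled index for an arm: a random convex
    reweighting of the observed rewards, augmented with the exploration
    bonus treated as an extra optimistic pseudo-observation."""
    obs = np.append(observations, bonus)
    w = rng.dirichlet(np.ones(len(obs)))  # random weights summing to 1
    return float(w @ obs)

rng = np.random.default_rng(0)
arms = [rng.normal(0.3, 0.1, size=50), rng.normal(0.5, 0.1, size=50)]
bonus = 1.0  # known upper bound on the rewards

# Pairwise duels: each challenger is compared against the empirical
# leader; a resampled index beating the leader's mean triggers a play.
leader = int(np.argmax([a.mean() for a in arms]))
for k, obs in enumerate(arms):
    if k == leader:
        continue
    if dirichlet_index(obs, bonus, rng) >= arms[leader].mean():
        print(f"play challenger arm {k}")
    else:
        print(f"play leader arm {leader}")
```

Because the Dirichlet weights are data-dependent rather than tied to a parametric model, the comparison only uses the observed rewards and the bonus, which is what gives the method its robustness to misspecified distributions.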